voronoi cell
ACombinatorialAlgorithmfortheSemi-Discrete OptimalTransportProblem
In the semi-discrete2-Wasserstein problem, we wish to compute the cheapest way to transport all the mass from a continuous distribution µ to a discrete distributionν in Rd for d 1, where the cost of transporting unitmassbetween pointsaandbisd(a,b)= a b 2. When both distributions are discrete, a simple combinatorial framework has been used to find the exact solution (see e.g.
Dendrograms of Mixing Measures for Softmax-Gated Gaussian Mixture of Experts: Consistency without Model Sweeps
Hai, Do Tien, Mai, Trung Nguyen, Nguyen, TrungTin, Ho, Nhat, Nguyen, Binh T., Drovandi, Christopher
We develop a unified statistical framework for softmax-gated Gaussian mixture of experts (SGMoE) that addresses three long-standing obstacles in parameter estimation and model selection: (i) non-identifiability of gating parameters up to common translations, (ii) intrinsic gate-expert interactions that induce coupled differential relations in the likelihood, and (iii) the tight numerator-denominator coupling in the softmax-induced conditional density. Our approach introduces Voronoi-type loss functions aligned with the gate-partition geometry and establishes finite-sample convergence rates for the maximum likelihood estimator (MLE). In over-specified models, we reveal a link between the MLE's convergence rate and the solvability of an associated system of polynomial equations characterizing near-nonidentifiable directions. For model selection, we adapt dendrograms of mixing measures to SGMoE, yielding a consistent, sweep-free selector of the number of experts that attains pointwise-optimal parameter rates under overfitting while avoiding multi-size training. Simulations on synthetic data corroborate the theory, accurately recovering the expert count and achieving the predicted rates for parameter estimation while closely approximating the regression function. Under model misspecification (e.g., $ε$-contamination), the dendrogram selection criterion is robust, recovering the true number of mixture components, while the Akaike information criterion, the Bayesian information criterion, and the integrated completed likelihood tend to overselect as sample size grows. On a maize proteomics dataset of drought-responsive traits, our dendrogram-guided SGMoE selects two experts, exposes a clear mixing-measure hierarchy, stabilizes the likelihood early, and yields interpretable genotype-phenotype maps, outperforming standard criteria without multi-size training.
- Asia > Middle East > Jordan (0.04)
- Asia > Vietnam > Hồ Chí Minh City > Hồ Chí Minh City (0.04)
- Europe > France > Auvergne-Rhône-Alpes > Lyon > Lyon (0.04)
- (5 more...)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Cellular Learning: Scattered Data Regression in High Dimensions via Voronoi Cells
I present a regression algorithm that provides a continuous, piecewise-smooth function approximating scattered data. It is based on composing and blending linear functions over Voronoi cells, and it scales to high dimensions. The algorithm infers Voronoi cells from seed vertices and constructs a linear function for the input data in and around each cell. As the algorithm does not explicitly compute the Voronoi diagram, it avoids the curse of dimensionality. An accuracy of around 98.2% on the MNIST dataset with 722,200 degrees of freedom (without data augmentation, convolution, or other geometric operators) demonstrates the applicability and scalability of the algorithm.
- North America > United States > New York > New York County > New York City (0.04)
- North America > United States > Massachusetts > Middlesex County > Somerville (0.04)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- (2 more...)
- Information Technology > Data Science > Data Mining (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.94)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.88)
- Asia > Middle East > Jordan (0.05)
- North America > United States (0.04)
- Europe > Sweden > Stockholm > Stockholm (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- North America > United States > New York (0.04)
- North America > United States > District of Columbia > Washington (0.04)
Distributed Multi-robot Source Seeking in Unknown Environments with Unknown Number of Sources
Chen, Lingpeng, Kailas, Siva, Deolasee, Srujan, Luo, Wenhao, Sycara, Katia, Kim, Woojun
We introduce a novel distributed source seeking framework, DIAS, designed for multi-robot systems in scenarios where the number of sources is unknown and potentially exceeds the number of robots. Traditional robotic source seeking methods typically focused on directing each robot to a specific strong source and may fall short in comprehensively identifying all potential sources. DIAS addresses this gap by introducing a hybrid controller that identifies the presence of sources and then alternates between exploration for data gathering and exploitation for guiding robots to identified sources. It further enhances search efficiency by dividing the environment into Voronoi cells and approximating source density functions based on Gaussian process regression. Additionally, DIAS can be integrated with existing source seeking algorithms. We compare DIAS with existing algorithms, including DoSS and GMES in simulated gas leakage scenarios where the number of sources outnumbers or is equal to the number of robots. The numerical results show that DIAS outperforms the baseline methods in both the efficiency of source identification by the robots and the accuracy of the estimated environmental density function.
- Asia > China (0.14)
- North America > United States > Illinois (0.14)
- Europe (0.14)
Convergence Rates for Softmax Gating Mixture of Experts
Nguyen, Huy, Ho, Nhat, Rinaldo, Alessandro
Mixture of experts (MoE) has recently emerged as an effective framework to advance the efficiency and scalability of machine learning models by softly dividing complex tasks among multiple specialized sub-models termed experts. Central to the success of MoE is an adaptive softmax gating mechanism which takes responsibility for determining the relevance of each expert to a given input and then dynamically assigning experts their respective weights. Despite its widespread use in practice, a comprehensive study on the effects of the softmax gating on the MoE has been lacking in the literature. To bridge this gap in this paper, we perform a convergence analysis of parameter estimation and expert estimation under the MoE equipped with the standard softmax gating or its variants, including a dense-to-sparse gating and a hierarchical softmax gating, respectively. Furthermore, our theories also provide useful insights into the design of sample-efficient expert structures. In particular, we demonstrate that it requires polynomially many data points to estimate experts satisfying our proposed \emph{strong identifiability} condition, namely a commonly used two-layer feed-forward network. In stark contrast, estimating linear experts, which violate the strong identifiability condition, necessitates exponentially many data points as a result of intrinsic parameter interactions expressed in the language of partial differential equations. All the theoretical results are substantiated with a rigorous guarantee.
- Asia > Middle East > Jordan (0.04)
- North America > United States > Texas > Travis County > Austin (0.04)
- North America > United States > California > Alameda County > Hayward (0.04)
- (3 more...)
- Research Report (0.49)
- Workflow (0.46)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.48)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.45)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.45)
On Zero-Initialized Attention: Optimal Prompt and Gating Factor Estimation
Diep, Nghiem T., Nguyen, Huy, Nguyen, Chau, Le, Minh, Nguyen, Duy M. H., Sonntag, Daniel, Niepert, Mathias, Ho, Nhat
The LLaMA-Adapter has recently emerged as an efficient fine-tuning technique for LLaMA models, leveraging zero-initialized attention to stabilize training and enhance performance. However, despite its empirical success, the theoretical foundations of zero-initialized attention remain largely unexplored. In this paper, we provide a rigorous theoretical analysis, establishing a connection between zero-initialized attention and mixture-of-expert models. We prove that both linear and non-linear prompts, along with gating functions, can be optimally estimated, with non-linear prompts offering greater flexibility for future applications. Empirically, we validate our findings on the open LLM benchmarks, demonstrating that non-linear prompts outperform linear ones. Notably, even with limited training data, both prompt types consistently surpass vanilla attention, highlighting the robustness and adaptability of zero-initialized attention.
- Asia > Vietnam > Hồ Chí Minh City > Hồ Chí Minh City (0.04)
- Asia > Middle East > Jordan (0.04)
- North America > United States > Texas > Travis County > Austin (0.04)
- (4 more...)